- From: goer@midway.uchicago.edu (Richard L. Goerwitz)
- Newsgroups: comp.sources.misc
- Subject: v16i085: mtf - Map tar filenames, Part01/02
- Message-ID: <1991Jan29.012143.21569@sparky.IMD.Sterling.COM>
- Date: 29 Jan 91 01:21:43 GMT
- Approved: kent@sparky.imd.sterling.com
- X-Checksum-Snefru: 5d9f6bcf 21d03491 c3363fc3 4c90c7a2
-
- Submitted-by: goer@midway.uchicago.edu (Richard L. Goerwitz)
- Posting-number: Volume 16, Issue 85
- Archive-name: mtf/part01
-
- Tar archives often come packed with filenames longer than 14 chars,
- and with source code that requires that the filenames be fully pre-
- served. This utility, mtf, runs through the tar headers, finds all
- overlong filenames, renames them, renames them in any text files it
- finds, and then rewrites the tar header checksums.
-
- -Richard
-
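The last step mentioned above, rewriting the tar header checksums, can be illustrated with a minimal Python sketch. It is not part of the mtf distribution. It assumes the classic 512-byte tar header, in which the 8-byte checksum field sits at byte offset 148, after the name (100 bytes), mode/uid/gid (24), size (12) and mtime (12) fields. The function names are made up for illustration, and the sketch zero-pads the octal digits, whereas mtf itself pads them with spaces.

```python
BLOCK = 512          # tar headers and data are laid out in 512-byte blocks
CHK_START = 148      # assumed offset of the checksum field (100 + 24 + 12 + 12)
CHK_END = 156        # the field is 8 bytes wide


def header_checksum(block: bytes) -> int:
    """Sum all 512 header bytes, counting the checksum field as 8 spaces."""
    if len(block) != BLOCK:
        raise ValueError("a tar header is exactly 512 bytes")
    blanked = block[:CHK_START] + b" " * 8 + block[CHK_END:]
    return sum(blanked)


def rewrite_checksum(block: bytes) -> bytes:
    """Return a copy of the header with a freshly computed checksum field."""
    # Six octal digits, a NUL, then a space: 8 bytes in total.
    field = b"%06o\0 " % header_checksum(block)
    return block[:CHK_START] + field + block[CHK_END:]
```

Any change to the name or size fields invalidates the stored sum, which is why a tool that renames files inside an archive has to recompute it for every header it touches.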
- ---- Cut Here and feed the following to sh ----
- #!/bin/sh
- # This is a shell archive (produced by shar 3.49)
- # To extract the files from this archive, save it to a file, remove
- # everything above the "!/bin/sh" line above, and type "sh file_name".
- #
- # made 01/20/1991 23:34 UTC by goer@sophist.uchicago.edu
- # Source directory /u/richard/Mtf
- #
- # existing files will NOT be overwritten unless -c is specified
- # This format requires very little intelligence at unshar time.
- # "if test", "cat", "rm", "echo", "true", and "sed" may be needed.
- #
- # This is part 1 of a multipart archive
- # do not concatenate these parts, unpack them in order with /bin/sh
- #
- # This shar contains:
- # length mode name
- # ------ ---------- ------------------------------------------
- # 16721 -r--r--r-- mtf.icn
- # 3341 -rw-r--r-- README
- # 659 -rw-r--r-- Makefile.dist
- #
- if test -r _shar_seq_.tmp; then
- echo 'Must unpack archives in sequence!'
- echo Please unpack part `cat _shar_seq_.tmp` next
- exit 1
- fi
- # ============= mtf.icn ==============
- if test -f 'mtf.icn' -a X"$1" != X"-c"; then
- echo 'x - skipping mtf.icn (File already exists)'
- rm -f _shar_wnt_.tmp
- else
- > _shar_wnt_.tmp
- echo 'x - extracting mtf.icn (Text)'
- sed 's/^X//' << 'SHAR_EOF' > 'mtf.icn' &&
- X#############################################################################
- X#
- X# NAME: mtf3.icn
- X#
- X# TITLE: map tar file
- X#
- X# AUTHOR: Richard Goerwitz
- X#
- X# VERSION: 3.3
- X#
- X#############################################################################
- X#
- X# This and future versions of mtf are hereby placed in the public domain -RLG
- X#
- X#############################################################################
- X#
- X# PURPOSE: Maps 15+ char. filenames in a tar archive to 14 chars.
- X# Handles both header blocks and the archive itself. Mtf is intended
- X# to facilitate installation of tar'd archives on systems subject to
- X# the System V 14-character filename limit.
- X#
- X# USAGE: mtf inputfile [-r reportfile] [-e .extensions] [-x exceptions]
- X#
- X# "Inputfile" is a tar archive. "Reportfile" is a file containing a
- X# list of files already mapped by mtf in a previous run (used to
- X# avoid clashes with filenames in use outside the current archive).
- X# The -e switch precedes a list of filename .extensions which mtf is
- X# supposed to leave unscathed by the mapping process
- X# (single-character extensions such as .c and .o are automatically
- X# preserved; -e allows the user to specify additional extensions,
- X# such as .pxl, .cpi, and .icn). The final switch, -x, precedes a
- X# list of strings which should not be mapped at all. Use this switch
- X# if, say, you have a C file with a structure.field combination such
- X# as "thisisveryverybig.hashptr" in an archive that contains a file
- X# called "thisisveryverybig.h," and you want to avoid mapping that
- X# portion of the struct name which matches the name of the overlong
- X# file (to wit, "mtf inputfile -x thisisveryverybig.hashptr"). To
- X# prevent mapping of any string (including overlong filenames) begin-
- X# ning, say, with "thisisvery," use "mtf inputfile -x thisisvery."
- X# Be careful with this option, or you might end up defeating the
- X# whole point of using mtf in the first place.
- X#
- X# OUTPUT FORMAT: Mtf writes a mapped tar archive to the stdout.
- X# When finished, it leaves a file called "map.report" in the current
- X# directory which records what filenames were mapped and how. Rename
- X# and save this file, and use it as the "reportfile" argument to any
- X# subsequent runs of mtf in this same directory. Even if you don't
- X# plan to run mtf again, this file should still be examined, just to
- X# be sure that the new filenames are acceptable, and to see if
- X# perhaps additional .extensions and/or exceptions should be
- X# specified.
- X#
- X# BUGS: Mtf only maps filenames found in the main tar headers.
- X# Because of this, mtf cannot accept nested tar archives. If you try
- X# to map a tar archive within a tar file, mtf will abort with a nasty
- X# message about screwing up your files. Please note that, unless you
- X# give mtf a "reportfile" to consider, it knows nothing about files
- X# existing outside the archive. Hence, if an input archive refers to
- X# an overlong filename in another archive, mtf naturally will not
- X# know to shorten it. Mtf will, in fact, have no way of knowing that
- X# it is a filename, and not, say, an identifier in a C program.
- X# Final word of caution: Try not to use mtf on binaries. It cannot
- X# possibly preserve the correct format and alignment of strings in an
- X# executable. Same goes for compressed files. Mtf can't map
- X# filenames that it can't read!
- X#
- X####################################################################
- X
- X
- Xglobal filenametbl, chunkset, short_chunkset # see procedure mappiece(s)
- Xglobal extensions, no_nos # ditto
- X
- Xrecord hblock(name,junk,size,mtime,chksum, # tar header struct;
- X linkflag,linkname,therest) # see readtarhdr(s)
- X
- X
- Xprocedure main(a)
- X
- X usage := "usage: mtf inputfile [-r reportfile] " ||
- X "[-e .extensions] [-x exceptions]"
- X
- X *a = 0 & stop(usage)
- X
- X intext := open_input_file(a[1]) & pop(a)
- X
- X i := 0
- X extensions := []; no_nos := []
- X while (i +:= 1) <= *a do {
- X case a[i] of {
- X "-r" : readin_old_map_report(a[i+:=1])
- X "-e" : current_list := extensions
- X "-x" : current_list := no_nos
- X default : put(current_list,a[i])
- X }
- X }
- X
- X every !extensions ?:= (=".", tab(0))
- X
- X # Run through all the headers in the input file, filling
- X # (global) filenametbl with the names of overlong files;
- X # make_table_of_filenames fails if there are no such files.
- X make_table_of_filenames(intext) | {
- X write(&errout,"mtf: no overlong path names to map")
- X a[1] ? (tab(find(".tar")+4), pos(0)) |
- X write(&errout,"(Is ",a[1]," even a tar archive?)")
- X exit(1)
- X }
- X
- X # Now that a table of overlong filenames exists, go back
- X # through the text, remapping all occurrences of these names
- X # to new, 14-char values; also, reset header checksums, and
- X # reformat text into correctly padded 512-byte blocks. Ter-
- X # minate output with 512 nulls.
- X seek(intext,1)
- X every writes(output_mapped_headers_and_texts(intext))
- X
- X close(intext)
- X write_report() # Record mapped file and dir names for future ref.
- X exit(0)
- X
- Xend
- X
- X
- X
- Xprocedure open_input_file(s)
- X intext := open("" ~== s,"r") |
- X stop("mtf: can't open ",s)
- X find("UNIX",&features) |
- X stop("mtf: I'm not tested on non-Unix systems.")
- X s[-2:0] == ".Z" &
- X stop("mtf: sorry, can't accept compressed files")
- X return intext
- Xend
- X
- X
- X
- Xprocedure readin_old_map_report(s)
- X
- X initial {
- X filenametbl := table()
- X chunkset := set()
- X short_chunkset := set()
- X }
- X
- X mapfile := open_input_file(s)
- X while line := read(mapfile) do {
- X line ? {
- X if chunk := tab(many(~' \t')) & tab(upto(~' \t')) &
- X lchunk := move(14) & pos(0) then {
- X filenametbl[chunk] := lchunk
- X insert(chunkset,chunk)
- X insert(short_chunkset,chunk[1:16])
- X }
- X if /chunk | /lchunk
- X then stop("mtf: report file, ",s," seems mangled.")
- X }
- X }
- X
- Xend
- X
- X
- X
- Xprocedure make_table_of_filenames(intext)
- X
- X local header # chunkset is global
- X
- X # search headers for overlong filenames; for now
- X # ignore everything else
- X while header := readtarhdr(reads(intext,512)) do {
- X # tab upto the next header block
- X tab_nxt_hdr(intext,trim_str(header.size),1)
- X # record overlong filenames in several global tables, sets
- X fixpath(trim_str(header.name))
- X }
- X *\chunkset ~= 0 | fail
- X return &null
- X
- Xend
- X
- X
- X
- Xprocedure output_mapped_headers_and_texts(intext)
- X
- X # Remember that filenametbl, chunkset, and short_chunkset
- X # (which are used by various procedures below) are global.
- X local header, newtext, full_block, block, lastblock
- X
- X # Read in headers, one at a time.
- X while header := readtarhdr(reads(intext,512)) do {
- X
- X # Replace overlong filenames with shorter ones, according to
- X # the conversions specified in the global hash table filenametbl
- X # (which were generated by fixpath() on the first pass).
- X header.name := left(map_filenams(header.name),100,"\x00")
- X header.linkname := left(map_filenams(header.linkname),100,"\x00")
- X
- X # Use header.size field to determine the size of the subsequent text.
- X # Read in the text as one string. Map overlong filenames found in it
- X # to shorter names as specified in the global hash table filenametbl.
- X newtext := map_filenams(tab_nxt_hdr(intext,trim_str(header.size)))
- X
- X # Now, find the length of newtext, and insert it into the size field.
- X header.size := right(exbase10(*newtext,8) || " ",12," ")
- X
- X # Calculate the checksum of the newly retouched header.
- X header.chksum := right(exbase10(get_checksum(header),8)||"\x00 ",8," ")
- X
- X # Finally, join all the header fields into a new block and write it out
- X full_block := ""; every full_block ||:= !header
- X suspend left(full_block,512,"\x00")
- X
- X # Now we're ready to write out the text, padding the final block
- X # out to an even 512 bytes if necessary; the next header must start
- X # right at the beginning of a 512-byte block.
- X newtext ? {
- X while block := move(512)
- X do suspend block
- X pos(0) & next
- X lastblock := left(tab(0),512,"\x00")
- X suspend lastblock
- X }
- X }
- X # Write out a final null-filled block. Some tar programs write
- X # out 1024 nulls at the end, because the tar format marks the end
- X # of an archive with two null-filled 512-byte blocks.
- X return repl("\x00",512)
- X
- Xend
- X
- X
- X
- Xprocedure trim_str(s)
- X
- X # Knock out spaces, nulls from those crazy tar header
- X # block fields (some of which end in a space and a null,
- X # some just a space, and some just a null [anyone know
- X # why?]).
- X return s ? {
- X (tab(many(' ')) | &null) &
- X trim(tab(find("\x00")|0))
- X } \ 1
- X
- Xend
- X
- X
- X
- Xprocedure tab_nxt_hdr(f,size_str,firstpass)
- X
- X # Tab upto the next header block. Return the bypassed text
- X # as a string if not the first pass.
- X
- X local hs, next_header_offset
- X
- X hs := integer("8r" || size_str)
- X next_header_offset := (hs / 512) * 512
- X hs % 512 ~= 0 & next_header_offset +:= 512
- X if 0 = next_header_offset then return ""
- X else {
- X # if this is pass no. 1 don't bother returning a value; we're
- X # just collecting long filenames;
- X if \firstpass then {
- X seek(f,where(f)+next_header_offset)
- X return
- X }
- X else {
- X return reads(f,next_header_offset)[1:hs+1] |
- X stop("mtf: error reading in ",
- X string(next_header_offset)," bytes.")
- X }
- X }
- X
- Xend
- X
- X
- X
- Xprocedure fixpath(s)
- X
- X # Fixpath is a misnomer of sorts, since it is used on
- X # the first pass only, and merely examines each filename
- X # in a path, using the procedure mappiece to record any
- X # overlong ones in the global table filenametbl and in
- X # the global sets chunkset and short_chunkset; no fixing
- X # is actually done here.
- X
- X s2 := ""
- X s ? {
- X while piece := tab(find("/")+1)
- X do s2 ||:= mappiece(piece)
- X s2 ||:= mappiece(tab(0))
- X }
- X return s2
- X
- Xend
- X
- X
- X
- Xprocedure mappiece(s)
- X
- X # Check s (the name of a file or dir as recorded in the tar header
- X # being examined) to see if it is over 14 chars long. If so,
- X # generate a unique 14-char version of the name, and store
- X # both values in the global hashtable filenametbl. Also store
- X # the original (overlong) file name in chunkset. Store the
- X # first fifteen chars of the original file name in short_chunkset.
- X # Sorry about all of the tables and sets. It actually makes for
- X # a reasonably efficient program. Doing away with both sets,
- X # while possible, causes a tenfold drop in execution speed!
- X
- X # global filenametbl, chunkset, short_chunkset, extensions
- X local j, ending
- X
- X initial {
- X /filenametbl := table()
- X /chunkset := set()
- X /short_chunkset := set()
- X }
- X
- X chunk := trim(s,'/')
- X if chunk ? (tab(find(".tar")+4), pos(0)) then {
- X write(&errout, "mtf: Sorry, I can't let you do this.\n",
- X " You've nested a tar archive within\n",
- X " another tar archive, which makes it\n",
- X " likely I'll f your filenames ubar.")
- X exit(2)
- X }
- X if *chunk > 14 then {
- X i := 0
- X
- X if /filenametbl[chunk] then {
- X # if we have not seen this file, then...
- X repeat {
- X # ...find a new unique 14-character name for it;
- X # preserve important suffixes like ".Z," ".c," etc.
- X # First, check to see if the original filename (chunk)
- X # ends in an important extension...
- X if chunk ?
- X (tab(find(".")),
- X ending := move(1) || tab(match(!extensions)|any(&ascii)),
- X pos(0)
- X )
- X # ...If so, then leave the extension alone; mess with the
- X # middle part of the filename (e.g. file.with.extension.c ->
- X # file.with001.c).
- X then {
- X j := (15 - *ending - 3)
- X lchunk:= chunk[1:j] || right(string(i+:=1),3,"0") || ending
- X }
- X # If no important extension is present, then reformat the
- X # end of the file (e.g. too.long.file.name -> too.long.fil01).
- X else lchunk := chunk[1:13] || right(string(i+:=1),2,"0")
- X
- X # If the resulting shorter file name has already been used...
- X if lchunk == !filenametbl
- X # ...then go back and find another (i.e. increment i & try
- X # again; else break from the repeat loop, and...
- X then next else break
- X }
- X # ...record both the old filename (chunk) and its new,
- X # mapped name (lchunk) in filenametbl. Also record the
- X # mapped names in chunkset and short_chunkset.
- X filenametbl[chunk] := lchunk
- X insert(chunkset,chunk)
- X insert(short_chunkset,chunk[1:16])
- X }
- X }
- X
- X # If the filename is overlong, return lchunk (the shortened
- X # name), else return the original name (chunk). If the name,
- X # as passed to the current function, contained a trailing /
- X # (i.e. if s[-1]=="/"), then put the / back. This could be
- X # done more elegantly.
- X return (\lchunk | chunk) || ((s[-1] == "/") | "")
- X
- Xend
- X
- X
- X
- Xprocedure readtarhdr(s)
- X
- X # Read the silly tar header into a record. Note that, as was
- X # complained about above, some of the fields end in a null, some
- X # in a space, and some in a space and a null. The procedure
- X # trim_str() may be (and in fact often _is_) used to remove this
- X # extra garbage.
- X
- X this_block := hblock()
- X s ? {
- X this_block.name := move(100) # <- to be looked at later
- X this_block.junk := move(8+8+8) # skip the permissions, uid, etc.
- X this_block.size := move(12) # <- to be looked at later
- X this_block.mtime := move(12)
- X this_block.chksum := move(8) # <- to be looked at later
- X this_block.linkflag := move(1)
- X this_block.linkname := move(100) # <- to be looked at later
- X this_block.therest := tab(0)
- X }
- X integer(this_block.size) | fail # If it's not an integer, we've hit
- X # the final (null-filled) block.
- X return this_block
- X
- Xend
- X
- X
- X
- Xprocedure map_filenams(s)
- X
- X # Chunkset is global, and contains all the overlong filenames
- X # found in the first pass through the input file; here the aim
- X # is to map these filenames to the shortened variants as stored
- X # in filenametbl (GLOBAL).
- X
- X local s2, tmp_chunk_tbl, tmp_lst
- X static new_chunklist
- X initial {
- X
- X # Make sure filenames are sorted, longest first. Say we
- X # have a file called long_file_name_here.1 and one called
- X # long_file_name_here.1a. We want to check for the longer
- X # one first. Otherwise the portion of the second file which
- X # matches the first file will get remapped.
- X tmp_chunk_tbl := table()
- X every el := !chunkset
- X do insert(tmp_chunk_tbl,el,*el)
- X tmp_lst := sort(tmp_chunk_tbl,4)
- X new_chunklist := list()
- X every put(new_chunklist,tmp_lst[*tmp_lst-1 to 1 by -2])
- X
- X }
- X
- X s2 := ""
- X s ? {
- X until pos(0) do {
- X # first narrow the possibilities, using short_chunkset
- X if member(short_chunkset,&subject[&pos:&pos+15])
- X # then try to map from a long to a shorter 14-char filename
- X then {
- X if match(ch := !new_chunklist) & not match(!no_nos)
- X then s2 ||:= filenametbl[=ch]
- X else s2 ||:= move(1)
- X }
- X else s2 ||:= move(1)
- X }
- X }
- X return s2
- X
- Xend
- X
- X
- X# From the IPL. Thanks, Ralph -
- X# Author: Ralph E. Griswold
- X# Date: June 10, 1988
- X# exbase10(i,j) convert base-10 integer i to base j
- X# The maximum base allowed is 36.
- X
- Xprocedure exbase10(i,j)
- X
- X static digits
- X local s, d, sign
- X initial digits := &digits || &lcase
- X if i = 0 then return 0
- X if i < 0 then {
- X sign := "-"
- X i := -i
- X }
- X else sign := ""
- X s := ""
- X while i > 0 do {
- X d := i % j
- X if d > 9 then d := digits[d + 1]
- X s := d || s
- X i /:= j
- X }
- X return sign || s
- X
- Xend
- X
- X# end IPL material
- X
- X
- Xprocedure get_checksum(r)
- X
- X # Calculates the new value of the checksum field for the
- X # current header block. Note that the specification says
- X # that, when calculating this value, the chksum field must
- X # be blank-filled.
- X
- X sum := 0
- X r.chksum := " "
- X every field := !r
- X do every sum +:= ord(!field)
- X return sum
- X
- Xend
- X
- X
- X
- Xprocedure write_report()
- X
- X # This procedure writes out a list of filenames which were
- X # remapped (because they exceeded the SysV 14-char limit),
- X # and then notifies the user of the existence of this file.
- X
- X local outtext, stbl, i, j, mapfile_name
- X
- X # Get a unique name for the map.report (thereby preventing
- X # us from overwriting an older one).
- X mapfile_name := "map.report"; j := 1
- X until not close(open(mapfile_name,"r"))
- X do mapfile_name := (mapfile_name[1:11] || string(j+:=1))
- X
- X (outtext := open(mapfile_name,"w")) |
- X (outtext := open(mapfile_name := "/tmp/map.report","w")) |
- X stop("mtf: Can't find a place to put map.report!")
- X stbl := sort(filenametbl,3)
- X every i := 1 to *stbl -1 by 2 do {
- X match(!no_nos,stbl[i]) |
- X write(outtext,left(stbl[i],35," ")," ",stbl[i+1])
- X }
- X write(&errout,"\nmtf: ",mapfile_name," contains the list of changes.")
- X write(&errout," Please save this list!")
- X close(outtext)
- X return &null
- X
- Xend
- SHAR_EOF
- true || echo 'restore of mtf.icn failed'
- rm -f _shar_wnt_.tmp
- fi
- # ============= README ==============
- if test -f 'README' -a X"$1" != X"-c"; then
- echo 'x - skipping README (File already exists)'
- rm -f _shar_wnt_.tmp
- else
- > _shar_wnt_.tmp
- echo 'x - extracting README (Text)'
- sed 's/^X//' << 'SHAR_EOF' > 'README' &&
- XNAME: mtf
- X
- XLANGUAGE: Icon
- X
- XAUTHOR: Richard Goerwitz (goer@sophist.uchicago.edu)
- X
- XPURPOSE: Maps 15+ char. filenames in a tar archive to 14 chars.
- XHandles both header blocks and the archive itself. Mtf is intended to
- Xfacilitate installation of tar'd archives on systems subject to a
- X14-character filename limit.
- X
- XINSTALLATION: Cp Makefile.dist to Makefile and make. If all goes
- Xwell, and you have root privileges, edit the Makefile to reflect
- Xyour local file structure, and make install.
- X
- XUSAGE: mtf inputfile [-r reportfile] [-e .extensions] [-x exceptions]
- X
- X"Inputfile" is a tar archive. "Reportfile" is a file containing a list
- Xof files already mapped by mtf in a previous run (used to avoid
- Xclashes with filenames in use outside the current archive). The -e
- Xswitch precedes a list of filename .extensions which mtf is supposed
- SHAR_EOF
- true || echo 'restore of README failed'
- fi
- echo 'End of part 1'
- echo 'File README is continued in part 2'
- echo 2 > _shar_seq_.tmp
- exit 0
-
- exit 0 # Just in case...
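The renaming rule that mappiece applies in mtf.icn can be sketched in Python roughly as follows. This is an illustrative approximation rather than a port: the function name `shorten` and the `taken` set are assumptions made for the example, and unlike mtf the sketch gives up with an error once the numeric counter is exhausted.

```python
def shorten(name, taken, extensions=()):
    """Return a 14-character stand-in for `name` that is not in `taken`.

    A trailing one-character extension (".c", ".o") or one listed in
    `extensions` (given without the dot) is kept; a counter is spliced
    in just before it. Otherwise the counter replaces the tail.
    """
    if len(name) <= 14:
        return name                      # short enough already

    ending = ""
    for dot, ch in enumerate(name):
        if ch != ".":
            continue
        ext = name[dot + 1:]
        if len(ext) == 1 or ext in extensions:
            ending = name[dot:]          # e.g. ".c" or ".icn"
            break

    width = 3 if ending else 2           # digits used for the counter
    stem_len = 14 - width - len(ending)
    if stem_len < 1:
        raise ValueError("extension too long to preserve")
    stem = name[:stem_len]

    for i in range(1, 10 ** width):
        candidate = f"{stem}{i:0{width}d}{ending}"
        if candidate not in taken:
            return candidate
    raise RuntimeError("no unused 14-character name left")


print(shorten("file.with.extension.c", set()))   # file.with001.c
print(shorten("too.long.file.name", set()))      # too.long.fil01
```

Passing the set of names already handed out as `taken` plays the role of the check against filenametbl: a second file that would collapse to the same short name gets the next counter value instead.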
- --
- Kent Landfield INTERNET: kent@sparky.IMD.Sterling.COM
- Sterling Software, IMD UUCP: uunet!sparky!kent
- Phone: (402) 291-8300 FAX: (402) 291-4362
- Please send comp.sources.misc-related mail to kent@uunet.uu.net.
-